spec-6.5: cluster-aware backup / restore / PITR substrate #14
Merged
added 8 commits
July 1, 2026 09:55
Summary
Implements the spec-6.5 catalog/shmem/wire/manifest/PITR-target substrate, with mutating physical backup and restore-point entry points failing closed until the missing correctness primitives land.
- Adds pg_cluster_backup_start, pg_cluster_backup_stop, pg_cluster_create_restore_point, and read-only status/history/restore-point/PITR views.
- Mutating paths return feature_not_supported instead of creating an unsound restore point, publishing a partial manifest, or relying on naive cluster_scn_current() snapshots.
- Keeps --disable-cluster builds isolated: cluster SQL symbols remain linkable, while cluster-only runtime behavior is behind USE_PGRAC_CLUSTER.
Merge Readiness / Fail-Closed Contract
This PR is intentionally a fail-closed substrate and is not a usable physical backup, restore, or PITR implementation.
The current merge order allows this fail-closed spec-6.5 substrate to land before spec-6.0a. That does not relax the runtime contract:
- pg_cluster_backup_start, pg_cluster_backup_stop, pg_cluster_create_restore_point, and any real backup/restore/PITR success path must continue to return FEATURE_NOT_SUPPORTED while physical capture, durable per-thread WAL pinning, restore-point commit-drain/fence, restore replay integration, PITR replay/open semantics, SCN high-water restore, and RESETLOGS/incarnation handling are absent.
Future work may wire the real success paths only after those correctness primitives exist and are covered by multi-node backup -> restore -> recover -> read tests. Until then, the PR must produce no successful backup manifest, restore point, or PITR claim from incomplete state.
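The contract above can be pictured as a single gate that every mutating entry point consults before doing any work. The following is a minimal, standalone sketch and not the code in this PR: the type names, the field names, and backup_gate_check() are hypothetical, and the only identifier taken from the description is the FEATURE_NOT_SUPPORTED outcome.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes; the PR only names FEATURE_NOT_SUPPORTED. */
typedef enum
{
    BACKUP_GATE_OK = 0,
    BACKUP_GATE_FEATURE_NOT_SUPPORTED
} BackupGateResult;

/* Hypothetical flags, one per correctness primitive listed in the contract. */
typedef struct
{
    bool physical_capture;
    bool durable_wal_pinning;       /* durable per-thread WAL pinning */
    bool commit_drain_fence;        /* restore-point commit-drain/fence */
    bool restore_replay;            /* restore replay integration */
    bool pitr_open_semantics;       /* PITR replay/open semantics */
    bool scn_high_water_restore;
    bool resetlogs_incarnation;     /* RESETLOGS/incarnation handling */
} BackupPrimitives;

/* Refuse as soon as one primitive is absent, reporting which one. */
#define REQUIRE_PRIMITIVE(p, field, missing)              \
    do {                                                  \
        if (!(p)->field)                                  \
        {                                                 \
            if ((missing) != NULL)                        \
                *(missing) = #field;                      \
            return BACKUP_GATE_FEATURE_NOT_SUPPORTED;     \
        }                                                 \
    } while (0)

/*
 * Returns BACKUP_GATE_OK only when every primitive is present.
 * Unknown state (a NULL pointer) is treated as "absent", so the gate
 * fails closed rather than open.
 */
BackupGateResult
backup_gate_check(const BackupPrimitives *p, const char **missing)
{
    if (p == NULL)
    {
        if (missing != NULL)
            *missing = "primitives";
        return BACKUP_GATE_FEATURE_NOT_SUPPORTED;
    }

    REQUIRE_PRIMITIVE(p, physical_capture, missing);
    REQUIRE_PRIMITIVE(p, durable_wal_pinning, missing);
    REQUIRE_PRIMITIVE(p, commit_drain_fence, missing);
    REQUIRE_PRIMITIVE(p, restore_replay, missing);
    REQUIRE_PRIMITIVE(p, pitr_open_semantics, missing);
    REQUIRE_PRIMITIVE(p, scn_high_water_restore, missing);
    REQUIRE_PRIMITIVE(p, resetlogs_incarnation, missing);

    return BACKUP_GATE_OK;
}
```

Under this sketch, an entry point would call the gate first and return the unsupported result before touching any manifest or restore-point state. Reporting the first missing primitive is a design choice that makes the refusal easier to diagnose.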
Review Fix
This revision addresses the critical review finding that the earlier draft returned success while the restore-point commit-drain barrier and physical capture path were not implemented.
- Removes the pending_commits_empty=true / commit_fence_held=true caller lie.
- Removes the cluster_scn_current() restore-point cut fallback from the mutating path.
The full spec-6.5 physical backup/copy/restore/PITR execution path remains intentionally unclaimed in this PR. It still requires commit-drain/fence, durable WAL pinning, physical capture, restore replay integration, SCN high-water restore, RESETLOGS/incarnation handling, and multi-node restore-then-read e2e before it can be considered complete.
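To illustrate why caller-asserted pending_commits_empty / commit_fence_held values are unsafe, here is a hedged sketch of a precheck that takes no such flags and instead derives both facts from per-node state it observes itself. NodeCommitState, RestorePointPrecheck, and restore_point_precheck() are hypothetical names invented for this example. The PR itself does not implement the drain barrier, so its mutating path refuses unconditionally.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node state, as observed by the coordinator itself. */
typedef struct
{
    uint32_t pending_commits;   /* in-flight commits not yet drained */
    bool     fence_held;        /* commit fence currently held on this node */
} NodeCommitState;

typedef enum
{
    RESTORE_POINT_MAY_PROCEED = 0,
    RESTORE_POINT_REFUSED
} RestorePointPrecheck;

/*
 * There are deliberately no "pending_commits_empty" or "commit_fence_held"
 * parameters: the caller cannot assert the preconditions, it can only hand
 * over observed state. Missing observations count as a refusal.
 */
RestorePointPrecheck
restore_point_precheck(const NodeCommitState *nodes, size_t n_nodes)
{
    if (nodes == NULL || n_nodes == 0)
        return RESTORE_POINT_REFUSED;   /* no evidence: fail closed */

    for (size_t i = 0; i < n_nodes; i++)
    {
        /* Every node must be drained and fenced for the cut to be sound. */
        if (nodes[i].pending_commits != 0 || !nodes[i].fence_held)
            return RESTORE_POINT_REFUSED;
    }

    return RESTORE_POINT_MAY_PROCEED;
}
```

The point of the shape is that a success result can only come from evidence gathered across all nodes, never from a boolean the caller supplied.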
Tests
Rebased onto origin/main 4a6bcc7ab3 (Merge PR #13 / spec-6.3a GRD/GES lifecycle reclaim) and replayed the 6.5 changes on top of the 6.3a lifecycle-reclaim semantics.
Local validation on macOS with --without-icu because local ICU headers/libs are unavailable:
- make -s -j4
- make -s install
- make -C src/test/cluster_unit check (136 binaries)
- src/test/cluster_unit/test_cluster_backup
- prove -I src/test/perl -I src/test/cluster_tap src/test/cluster_tap/t/332_cluster_backup_pitr.pl using the installed tree and elevated localhost bind permission
- scripts/ci/check-comment-headers.sh
- scripts/ci/run-cppcheck.sh (0 findings; baseline-diff clean)
- git diff --check
- --disable-cluster in-tree build in /private/tmp/pgrac-worktrees/linkdb-spec-6.5-disable-2: configure without --enable-cluster, then make -s -j4
scripts/ci/check-format.sh was also run locally. It still reports pre-existing formatting violations in unrelated files outside this PR's touch set (cluster_voting_disk_io.c, cluster_grd.c, cluster_tt_local.c, cluster_ges.c, cluster_cssd.h, cluster_ic_chunk.h, cluster_itl_slot.h, cluster_gcs_block.h); the 6.5 touched files are clang-format clean.
Scope / Boundaries
This PR does not implement ADG, standby read-only service, RDMA, DRM, production storage backend/fence-driver work, or storage-provider copy plumbing from spec-6.0a. It also does not change 6.0a or 6.3a branches/worktrees.
This PR is not a shippable full backup/restore/PITR implementation. It is an honest substrate PR: read-only views and pure helper contracts exist; mutating paths fail closed when required correctness proofs are absent.
This PR does not merge main, tag, mark shipped, or do release sync work.